71 research outputs found

    Protein sectors: statistical coupling analysis versus conservation

    Statistical coupling analysis (SCA) is a method for analyzing multiple sequence alignments that was used to identify groups of coevolving residues termed "sectors". The method applies spectral analysis to a matrix obtained by combining correlation information with sequence conservation. It has been asserted that the protein sectors identified by SCA are functionally significant, with different sectors controlling different biochemical properties of the protein. Here we reconsider the available experimental data and note that it involves almost exclusively proteins with a single sector. We show that in this case sequence conservation is the dominating factor in SCA, and can alone be used to make statistically equivalent functional predictions. Therefore, we suggest shifting the experimental focus to proteins for which SCA identifies several sectors. Correlations in protein alignments, which have been shown to be informative in a number of independent studies, would then be less dominated by sequence conservation. Comment: 36 pages, 17 figures
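
The abstract above argues that per-position sequence conservation alone can drive the functional predictions. As a rough illustration of what a conservation score is, here is a minimal sketch. It is not the SCA procedure: it swaps in a plain Shannon-entropy measure, ignores gaps and sequence weighting, and the function names and toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy, in bits, of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conservation_scores(alignment):
    """Score each position as 1 - entropy / log2(20).

    A score of 1.0 means the column is fully conserved. The 20-letter
    amino acid alphabet sets the maximum entropy (assumption: no gaps).
    """
    max_entropy = math.log2(20)
    columns = zip(*alignment)  # transpose: sequences -> columns
    return [1.0 - column_entropy(col) / max_entropy for col in columns]


# Hypothetical toy alignment: four sequences of length four.
toy_alignment = ["ACDE", "ACDF", "ACEG", "ACFH"]
scores = conservation_scores(toy_alignment)
```

In this toy alignment the first two positions are identical in every sequence and score highest, while the last position differs in every sequence and scores lowest.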

    Charge as a Selection Criterion for Translocation through the Nuclear Pore Complex

    Nuclear pore complexes (NPCs) are highly selective filters that control the exchange of material between nucleus and cytoplasm. The principles that govern selective filtering by NPCs are not fully understood. Previous studies find that cellular proteins capable of fast translocation through NPCs (transport receptors) are characterized by a high proportion of hydrophobic surface regions. Our analysis finds that transport receptors and their complexes are also highly negatively charged. Moreover, NPC components that constitute the permeability barrier are positively charged. We estimate that electrostatic interactions between a transport receptor and the NPC result in an energy gain of several kBT, which would enable significantly increased translocation rates of transport receptors relative to other cellular proteins. We suggest that negative charge is an essential criterion for selective passage through the NPC.
    Funding: Merck Research Laboratories; National Science Foundation (U.S.) (Division of Mathematical Sciences); Kavli Institute for Bionano Science & Technology at Harvard University; National Centers for Systems Biology (U.S.) (NIGMS grant GM068763); National Institute of General Medical Sciences (U.S.)
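
To give a sense of scale for "an energy gain of several kBT", the sketch below evaluates the Boltzmann factor exp(ΔE / kBT). It assumes, purely for illustration, that the relative translocation rate scales with this factor. The abstract does not state that model, and the energy values in the loop are hypothetical.

```python
import math


def boltzmann_enhancement(energy_gain_kbt):
    """Return exp(dE / kBT) for an energy gain expressed in units of kBT.

    Assumption: the rate enhancement follows a simple Boltzmann factor.
    """
    return math.exp(energy_gain_kbt)


# Hypothetical energy gains, in units of kBT.
for gain in (1, 2, 3, 5):
    print(f"{gain} kBT -> {boltzmann_enhancement(gain):.1f}x")
```

Under this assumption, a gain of a few kBT corresponds to a rate change of one to two orders of magnitude, which is why a modest electrostatic contribution could matter for selectivity.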

    Conservation Weighting Functions Enable Covariance Analyses to Detect Functionally Important Amino Acids

    The explosive growth in the number of protein sequences gives rise to the possibility of using the natural variation in sequences of homologous proteins to find residues that control different protein phenotypes. Because in many cases different phenotypes are each controlled by a group of residues, the mutations that separate one version of a phenotype from another will be correlated. Here we incorporate biological knowledge about protein phenotypes and their variability in the sequence alignment of interest into algorithms that detect correlated mutations, improving their ability to detect the residues that control those phenotypes. We demonstrate the power of this approach using simulations and recent experimental data. Applying these principles to the protein families encoded by Dscam and Protocadherin allows us to make testable predictions about the residues that dictate the specificity of molecular interactions.

    Inferring interaction partners from protein sequences.

    Specific protein-protein interactions are crucial in the cell, both to ensure the formation and stability of multiprotein complexes and to enable signal transduction in various pathways. Functional interactions between proteins result in coevolution between the interaction partners, causing their sequences to be correlated. Here we exploit these correlations to accurately identify, from sequence data alone, which proteins are specific interaction partners. Our general approach, which employs a pairwise maximum entropy model to infer couplings between residues, has been successfully used to predict the 3D structures of proteins from sequences. Thus inspired, we introduce an iterative algorithm to predict specific interaction partners from two protein families whose members are known to interact. We first assess the algorithm's performance on histidine kinases and response regulators from bacterial two-component signaling systems. We obtain a striking 0.93 true positive fraction on our complete dataset without any a priori knowledge of interaction partners, and we uncover the origin of this success. We then apply the algorithm to proteins from ATP-binding cassette (ABC) transporter complexes, and obtain accurate predictions in these systems as well. Finally, we present two metrics that accurately distinguish interacting protein families from noninteracting ones, using only sequence data.
    Funding: Human Frontier Science Program, National Institutes of Health (Grant ID: R01-GM082938), National Science Foundation (Grant ID: PHY-1305525), Marie Curie (Career Integration Grant ID: 631609), Next Generation Fellowship, Eric and Wendy Schmidt Transformative Technology Fund.
    This is the author accepted manuscript. The final version is available from the Proceedings of the National Academy of Sciences of the United States of America via https://doi.org/10.1073/pnas.160676211

    Optimal Design of Experiments by Combining Coarse and Fine Measurements.

    In many contexts, it is extremely costly to perform enough high-quality experimental measurements to accurately parametrize a predictive quantitative model. However, it is often much easier to carry out large numbers of experiments that indicate whether each sample is above or below a given threshold. Can many such categorical or "coarse" measurements be combined with a much smaller number of high-resolution or "fine" measurements to yield accurate models? Here, we demonstrate an intuitive strategy, inspired by statistical physics, wherein the coarse measurements are used to identify the salient features of the data, while the fine measurements determine the relative importance of these features. A linear model is inferred from the fine measurements, augmented by a quadratic term that captures the correlation structure of the coarse data. We illustrate our strategy by considering the problems of predicting the antimalarial potency and aqueous solubility of small organic molecules from their 2D molecular structure.
    L. J. C. acknowledges a Next Generation fellowship and a Marie Curie CIG [Evo-Couplings, Grant No. 631609]. M. P. B. acknowledges support from the Simons Foundation and from the National Science Foundation through DMS-1715477

    Proline provides site-specific flexibility for in vivo collagen.

    Fibrillar collagens have mechanical and biological roles, providing tissues with both tensile strength and cell binding sites which allow molecular interactions with cell-surface receptors such as integrins. A key question is: how do collagens allow tissue flexibility whilst maintaining well-defined ligand binding sites? Here we show that proline residues in collagen glycine-proline-hydroxyproline (Gly-Pro-Hyp) triplets provide local conformational flexibility, which in turn confers well-defined, low energy molecular compression-extension and bending, by employing two-dimensional 13C-13C correlation NMR spectroscopy on 13C-labelled intact ex vivo bone and in vitro osteoblast extracellular matrix. We also find that the positions of Gly-Pro-Hyp triplets are highly conserved between animal species, and are spatially clustered in the currently-accepted model of molecular ordering in collagen type I fibrils. We propose that the Gly-Pro-Hyp triplets in fibrillar collagens provide fibril "expansion joints" to maintain molecular ordering within the fibril, thereby preserving the structural integrity of ligand binding sites.
    Funding: BBSRC, EPSRC, Raymond and Beverly Sackler Fund for Physics of Medicine, Wellcome Trust, ER